## # A tibble: 19 x 2
## Q29 n
## <dbl> <int>
## 1 13 889
## 2 NA 735
## 3 5 654
## 4 4 557
## 5 8 347
## 6 3 180
## 7 11 173
## 8 16 151
## 9 10 128
## 10 7 119
## 11 17 111
## 12 1 84
## 13 12 71
## 14 18 69
## 15 6 53
## 16 2 21
## 17 14 13
## 18 9 6
## 19 15 3
## # A tibble: 5 x 2
## Q32 n
## <dbl> <int>
## 1 1 963
## 2 2 1063
## 3 3 1068
## 4 4 673
## 5 NA 597
## # A tibble: 5 x 2
## Q32 n
## <dbl> <int>
## 1 1 71
## 2 2 75
## 3 3 74
## 4 4 74
## 5 NA 441
## # A tibble: 13 x 2
## Q33 n
## <dbl> <int>
## 1 1 925
## 2 2 60
## 3 3 141
## 4 4 854
## 5 5 60
## 6 6 67
## 7 7 65
## 8 8 243
## 9 9 399
## 10 10 393
## 11 11 221
## 12 12 368
## 13 NA 568
## # A tibble: 13 x 2
## Q33 n
## <dbl> <int>
## 1 1 869
## 2 4 789
## 3 10 374
## 4 9 367
## 5 12 332
## 6 8 228
## 7 11 196
## 8 NA 130
## 9 3 117
## 10 6 61
## 11 5 57
## 12 7 56
## 13 2 53
## # A tibble: 6 x 2
## Q37 n
## <dbl> <int>
## 1 1 2833
## 2 2 954
## 3 3 28
## 4 4 44
## 5 1234 1
## 6 NA 504
Drop rows with NAs on specific questions, then filter out disciplines with fewer than 30 students in the sample (the cutoff).
## # A tibble: 18 x 2
## Q29 n
## <dbl> <int>
## 1 13 622
## 2 5 486
## 3 4 398
## 4 8 241
## 5 16 120
## 6 3 111
## 7 11 111
## 8 7 95
## 9 10 95
## 10 17 81
## 11 1 70
## 12 12 57
## 13 18 42
## 14 6 34
## 15 2 16
## 16 14 8
## 17 9 3
## 18 15 1
## # A tibble: 14 x 2
## Q29 n
## <dbl> <int>
## 1 13 622
## 2 5 486
## 3 4 398
## 4 8 241
## 5 16 120
## 6 3 111
## 7 11 111
## 8 7 95
## 9 10 95
## 10 17 81
## 11 1 70
## 12 12 57
## 13 18 42
## 14 6 34
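The two filtering steps above (drop missing values, then drop small majors) could be sketched as follows. This is a minimal base R illustration on made-up data, not the original code; the data frame `survey`, the choice of `Q29` and `Q37` as the questions to check, and the variable names are assumptions.

```r
# Hypothetical survey data: Q29 is the major code, Q37 the gender item.
survey <- data.frame(
  Q29 = c(rep(13, 40), rep(5, 35), rep(9, 6), NA, NA),
  Q37 = c(rep(1, 50), rep(2, 30), NA, 1, 1)
)
cutoff <- 30  # minimum number of students per major, as stated in the text

# 1. Drop rows with missing values on the questions of interest.
complete <- survey[complete.cases(survey[, c("Q29", "Q37")]), ]

# 2. Keep only majors that still have at least `cutoff` students.
major_n <- table(complete$Q29)
keep <- names(major_n)[major_n >= cutoff]
filtered <- complete[as.character(complete$Q29) %in% keep, ]

sort(table(filtered$Q29), decreasing = TRUE)
```

Counting after the NA drop matters: a major can fall below the cutoff only once incomplete rows are removed.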
Major counts and percentages
| Q29 | major | n | pct_total | cumulat_pct |
|---|---|---|---|---|
| 13 | Mec | 622 | 24.27 | 24.27 |
| 5 | Che | 486 | 18.96 | 43.23 |
| 4 | Civ | 398 | 15.53 | 58.76 |
| 8 | Ele | 241 | 9.40 | 68.16 |
| 16 | Softw | 120 | 4.68 | 72.84 |
| 3 | Bio | 111 | 4.33 | 77.18 |
| 11 | Ind | 111 | 4.33 | 81.51 |
| 7 | Comp | 95 | 3.71 | 85.21 |
| 10 | Env/Eco | 95 | 3.71 | 88.92 |
| 17 | Str/Arc | 81 | 3.16 | 92.08 |
| 1 | Aer/Oce | 70 | 2.73 | 94.81 |
| 12 | Mat | 57 | 2.22 | 97.03 |
| 18 | Gen | 42 | 1.64 | 98.67 |
| 6 | Con | 34 | 1.33 | 100.00 |
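The percentage columns in the table above can be recomputed from the counts alone. The sketch below uses the printed counts; the object name `major_counts` is an assumption, and the original may well have used a dplyr pipeline instead of base R.

```r
# Counts per major, copied from the table above.
major_counts <- data.frame(
  major = c("Mec", "Che", "Civ", "Ele", "Softw", "Bio", "Ind", "Comp",
            "Env/Eco", "Str/Arc", "Aer/Oce", "Mat", "Gen", "Con"),
  n = c(622, 486, 398, 241, 120, 111, 111, 95, 95, 81, 70, 57, 42, 34)
)

# Sort from largest to smallest so the cumulative column is meaningful.
major_counts <- major_counts[order(-major_counts$n), ]

# Accumulate the raw shares first and round only for display.
share <- major_counts$n / sum(major_counts$n)
major_counts$pct_total <- round(100 * share, 2)
major_counts$cumulat_pct <- round(100 * cumsum(share), 2)

major_counts
```

Rounding after `cumsum()` rather than before keeps the last cumulative value at exactly 100.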
Gender counts overall
| Q37 | n | pct_total |
|---|---|---|
| 1 | 1882 | 73.43 |
| 2 | 646 | 25.20 |
| 3 | 14 | 0.55 |
| 4 | 21 | 0.82 |
Fill in 0s for NAs on specific items (Q1, Q3, Q5).
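One way to do this fill in base R is shown below. The column names (`Q3a`, `Q5d`, `Q4a`) and the assumption that item columns are named with a question prefix plus a letter are illustrative only.

```r
# Hypothetical responses: NA means the box was left unchecked.
responses <- data.frame(
  student_id = 1:4,
  Q3a = c(1, NA, 0, NA),
  Q5d = c(NA, 1, NA, 1),
  Q4a = c(4, NA, 2, 3)   # not in the fill list, so its NA is kept
)

# Select every column whose name starts with one of the item prefixes.
fill_cols <- grep("^(Q1|Q3|Q5)[a-z]", names(responses), value = TRUE)

# Replace NA with 0 in those columns only.
responses[fill_cols][is.na(responses[fill_cols])] <- 0

responses
```

Restricting the fill to named columns avoids turning genuinely missing Likert responses into zeros.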
Drop majors with low counts (fewer than 30 students in the sample).
## # A tibble: 14 x 2
## Q29 n
## <dbl> <int>
## 1 13 596
## 2 5 469
## 3 4 390
## 4 8 238
## 5 16 118
## 6 3 110
## 7 11 108
## 8 7 93
## 9 10 90
## 10 17 78
## 11 1 70
## 12 12 55
## 13 18 40
## 14 6 31
## # A tibble: 14 x 2
## major n
## <chr> <int>
## 1 Mec 596
## 2 Che 469
## 3 Civ 390
## 4 Ele 238
## 5 Softw 118
## 6 Bio 110
## 7 Ind 108
## 8 Comp 93
## 9 Env/Eco 90
## 10 Str/Arc 78
## 11 Aer/Oce 70
## 12 Mat 55
## 13 Gen 40
## 14 Con 31
Drop rows where the major is NA.
## # A tibble: 14 x 2
## major n
## <chr> <int>
## 1 Mec 596
## 2 Che 469
## 3 Civ 390
## 4 Ele 238
## 5 Softw 118
## 6 Bio 110
## 7 Ind 108
## 8 Comp 93
## 9 Env/Eco 90
## 10 Str/Arc 78
## 11 Aer/Oce 70
## 12 Mat 55
## 13 Gen 40
## 14 Con 31
First, perform dimension reduction using UMAP.
## NULL
## [,1] [,2]
## [1,] -38.02495 -6.968480
## [2,] -31.19541 -3.773391
## [3,] 56.61304 5.437927
## [1] -38.02495 -31.19541 56.61304 56.44075 -18.30809 56.84714
## [1] -6.968480 -3.773391 5.437927 5.263140 22.195229 5.086338
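A two-dimensional UMAP embedding like the one printed above could be produced along these lines. This sketch assumes the `uwot` package (the report does not say which UMAP implementation was used), and the input matrix and parameter values are made up.

```r
library(uwot)

set.seed(42)
# Hypothetical stand-in for the numeric survey items fed to UMAP.
items <- matrix(as.numeric(sample(0:4, 200 * 10, replace = TRUE)), nrow = 200)

# Project to two dimensions; n_neighbors and min_dist are illustrative values.
embedding <- umap(items, n_neighbors = 15, n_components = 2, min_dist = 0.1)

head(embedding, 3)   # one row per respondent, one column per UMAP dimension
umap_1 <- embedding[, 1]
umap_2 <- embedding[, 2]
```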
Next, perform clustering with HDBSCAN
## HDBSCAN clustering for 2486 objects.
## Parameters: minPts = 120
## The clustering contains 6 cluster(s) and 103 noise points.
##
## 0 1 2 3 4 5 6
## 103 1026 173 549 150 348 137
##
## Available fields: cluster, minPts, cluster_scores, membership_prob,
## outlier_scores, hc
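The summary above has the form printed by `hdbscan()` from the `dbscan` package. A minimal sketch on synthetic points follows; the data and the `minPts` value here are illustrative, whereas the report used `minPts = 120` on 2486 respondents.

```r
library(dbscan)

set.seed(1)
# Hypothetical 2-D embedding: two well-separated blobs of points.
embedding <- rbind(
  matrix(rnorm(200, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean = 10, sd = 0.5), ncol = 2)
)

# minPts sets the smallest group that can count as a cluster.
fit <- hdbscan(embedding, minPts = 20)

fit                   # prints the same kind of summary as shown above
table(fit$cluster)    # cluster 0 collects the noise points
```

A larger `minPts` yields fewer, broader clusters and tends to push more borderline points into cluster 0.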
Join the data frames back together.
## # A tibble: 7 x 2
## cluster n
## <dbl> <int>
## 1 1 1026
## 2 3 549
## 3 5 348
## 4 2 173
## 5 4 150
## 6 6 137
## 7 0 103
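Attaching the cluster labels to the survey rows could look like the sketch below. The `student_id` key and both data frames are hypothetical; the important assumption is that the labels come back in the same row order as the data passed to the clustering step.

```r
# Hypothetical survey rows and cluster labels in matching row order.
survey <- data.frame(student_id = 1:5,
                     major = c("Ele", "Ele", "Mec", "Civ", "Che"))
cluster_labels <- c(1, 3, 1, 0, 2)   # e.g. the `cluster` field of an HDBSCAN fit

# Attach an explicit key to the labels, then join on it.
clusters <- data.frame(student_id = survey$student_id, cluster = cluster_labels)
joined <- merge(survey, clusters, by = "student_id")

# Count respondents per cluster, largest first.
sort(table(joined$cluster), decreasing = TRUE)
```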
## # A tibble: 7 x 3
## cluster_time_rank cluster cluster_avg
## <int> <dbl> <dbl>
## 1 1 1 0.377
## 2 2 0 3.08
## 3 3 4 3.83
## 4 4 5 5.14
## 5 5 6 7.47
## 6 6 3 9.24
## 7 7 2 17.4
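The relabelling from raw cluster number to `cluster_time_rank` shown above amounts to ranking the clusters by their average. A sketch using the printed averages (the lookup vector `rank_lookup` is an illustrative helper, not something named in the report):

```r
# Per-cluster averages, copied from the table above.
cluster_summary <- data.frame(
  cluster = c(1, 0, 4, 5, 6, 3, 2),
  cluster_avg = c(0.377, 3.08, 3.83, 5.14, 7.47, 9.24, 17.4)
)

# Rank 1 goes to the cluster with the smallest average.
cluster_summary$cluster_time_rank <-
  rank(cluster_summary$cluster_avg, ties.method = "min")

# A named lookup makes it easy to relabel individual respondents.
rank_lookup <- setNames(cluster_summary$cluster_time_rank,
                        cluster_summary$cluster)
rank_lookup[as.character(c(2, 0, 3))]
```

Ordering clusters this way gives the arbitrary HDBSCAN labels a consistent meaning across later tables and plots.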
Set clustering colors for all plots
Same information but faceted with clusters
This plot makes it easy to see that each cluster’s beliefs about the effects of global warming on different populations at different times vary in a clear pattern.
Broken plot
## # A tibble: 7 x 3
## cluster_time_rank n avg_Q5_total
## <int> <int> <dbl>
## 1 1 1026 3.68
## 2 2 103 3.22
## 3 5 137 3.18
## 4 3 150 3.17
## 5 4 348 2.97
## 6 6 549 2.68
## 7 7 173 2.08
## # A tibble: 10 x 6
## # Groups: Q5_item [10]
## Q5_item Q5_item_name statistic p.value parameter method
## <chr> <chr> <dbl> <dbl> <int> <chr>
## 1 Q5a Energy (supply/deman~ 9.35 1.55e- 1 6 Pearson's Chi-squ~
## 2 Q5b Disease 5.14 5.26e- 1 6 Pearson's Chi-squ~
## 3 Q5c Poverty and wealth d~ 52.0 1.88e- 9 6 Pearson's Chi-squ~
## 4 Q5d Climate change 190. 2.46e-38 6 Pearson's Chi-squ~
## 5 Q5e Terrorism and war 11.4 7.58e- 2 6 Pearson's Chi-squ~
## 6 Q5f Water supply 27.9 1.00e- 4 6 Pearson's Chi-squ~
## 7 Q5g Food availability 28.5 7.41e- 5 6 Pearson's Chi-squ~
## 8 Q5h Opp. for future gen 12.4 5.41e- 2 6 Pearson's Chi-squ~
## 9 Q5i Opp. for women and/o~ 88.0 7.97e-17 6 Pearson's Chi-squ~
## 10 Q5j Environmental degrad~ 111. 1.44e-21 6 Pearson's Chi-squ~
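A table like the one above, with one chi-squared test per item, can be built by splitting long-format data by item and testing cluster against response. The sketch below runs on simulated data; the column names mirror the output, but the data and the use of base R rather than a tidy pipeline are assumptions.

```r
set.seed(7)
# Hypothetical long-format data: one row per student per item.
n_students <- 700
long <- data.frame(
  cluster_time_rank = rep(sample(1:7, n_students, replace = TRUE), times = 3),
  Q5_item = rep(c("Q5a", "Q5b", "Q5c"), each = n_students),
  Q5_resp = rbinom(3 * n_students, size = 1, prob = 0.4)
)

# Run one chi-squared test of independence per item and stack the results.
results <- do.call(rbind, lapply(split(long, long$Q5_item), function(d) {
  test <- chisq.test(table(d$cluster_time_rank, d$Q5_resp))
  data.frame(
    Q5_item = d$Q5_item[1],
    statistic = unname(test$statistic),
    p.value = test$p.value,
    parameter = unname(test$parameter),   # degrees of freedom
    method = test$method
  )
}))

results
```

With seven clusters and a binary response, each test has (7 - 1) * (2 - 1) = 6 degrees of freedom, matching the `parameter` column above. Because ten items are tested at once, a multiple-comparison adjustment such as `p.adjust()` may be worth applying to the p-values.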
## # A tibble: 17,402 x 5
## student_id major cluster_time_rank Q3_item Q3_resp
## <int> <chr> <int> <chr> <dbl>
## 1 1 Ele 5 Q3a 0
## 2 1 Ele 5 Q3b 1
## 3 1 Ele 5 Q3c 0
## 4 1 Ele 5 Q3d 0
## 5 1 Ele 5 Q3e 0
## 6 1 Ele 5 Q3f 0
## 7 1 Ele 5 Q3g 0
## 8 2 Ele 4 Q3a 0
## 9 2 Ele 4 Q3b 1
## 10 2 Ele 4 Q3c 0
## # ... with 17,392 more rows
## # A tibble: 7 x 6
## # Groups: Q3_item [7]
## Q3_item Q3_item_name statistic p.value parameter method
## <chr> <chr> <dbl> <dbl> <int> <chr>
## 1 Q3a MA/MS (non-eng) 15.6 0.0164 6 Pearson's Chi-squared test
## 2 Q3b ME/MS (eng) 15.3 0.0183 6 Pearson's Chi-squared test
## 3 Q3c PhD (eng) 7.45 0.281 6 Pearson's Chi-squared test
## 4 Q3d MBA 5.53 0.478 6 Pearson's Chi-squared test
## 5 Q3e JD (law) 12.4 0.0545 6 Pearson's Chi-squared test
## 6 Q3f MD 3.96 0.682 6 Pearson's Chi-squared test
## 7 Q3g Other 11.0 0.0869 6 Pearson's Chi-squared test
## # A tibble: 39,776 x 5
## student_id major cluster_time_rank Q4_item Q4_resp
## <int> <chr> <int> <chr> <dbl>
## 1 1 Ele 5 Q4a 4
## 2 1 Ele 5 Q4b 3
## 3 1 Ele 5 Q4c 3
## 4 1 Ele 5 Q4d 2
## 5 1 Ele 5 Q4e 4
## 6 1 Ele 5 Q4f 2
## 7 1 Ele 5 Q4g 2
## 8 1 Ele 5 Q4h 4
## 9 1 Ele 5 Q4i 4
## 10 1 Ele 5 Q4j 1
## # ... with 39,766 more rows
Visualize with boxplots instead of mosaic plots.
## # A tibble: 16 x 6
## # Groups: Q4_item [16]
## Q4_item Q4_item_name statistic p.value parameter method
## <chr> <chr> <dbl> <dbl> <int> <chr>
## 1 Q4a Make money 24.3 4.44e-1 24 Pearson's Chi-sq~
## 2 Q4b Fame 25.4 3.82e-1 24 Pearson's Chi-sq~
## 3 Q4c Help others 56.0 2.32e-4 24 Pearson's Chi-sq~
## 4 Q4d Supervise others 25.6 3.71e-1 24 Pearson's Chi-sq~
## 5 Q4e Job sec. and opp. 45.9 4.57e-3 24 Pearson's Chi-sq~
## 6 Q4f Work w/ people 54.0 4.27e-4 24 Pearson's Chi-sq~
## 7 Q4g Invent/design 28.2 2.53e-1 24 Pearson's Chi-sq~
## 8 Q4h Develop knowledge/~ 24.3 4.44e-1 24 Pearson's Chi-sq~
## 9 Q4i Personal/fam. time 31.5 1.40e-1 24 Pearson's Chi-sq~
## 10 Q4j Easy job 68.7 3.41e-6 24 Pearson's Chi-sq~
## 11 Q4k Exciting env. 32.8 1.09e-1 24 Pearson's Chi-sq~
## 12 Q4l Solve societal pro~ 90.5 1.19e-9 24 Pearson's Chi-sq~
## 13 Q4m Use talent/abiliti~ 41.0 1.68e-2 24 Pearson's Chi-sq~
## 14 Q4n Do hands-on work 25.7 3.69e-1 24 Pearson's Chi-sq~
## 15 Q4o Apply math/sci. 23.5 4.90e-1 24 Pearson's Chi-sq~
## 16 Q4p Volunteer w/ chari~ 70.6 1.76e-6 24 Pearson's Chi-sq~
## # A tibble: 16 x 6
## # Groups: Q4_item [16]
## Q4_item Q4_item_name statistic p.value parameter method
## <chr> <chr> <dbl> <dbl> <int> <chr>
## 1 Q4a Make money 11.9 1.78e- 2 4 Kruskal-Wallis rank~
## 2 Q4b Fame 2.87 5.80e- 1 4 Kruskal-Wallis rank~
## 3 Q4c Help others 25.8 3.52e- 5 4 Kruskal-Wallis rank~
## 4 Q4d Supervise others 2.72 6.05e- 1 4 Kruskal-Wallis rank~
## 5 Q4e Job sec. and opp. 12.0 1.73e- 2 4 Kruskal-Wallis rank~
## 6 Q4f Work w/ people 3.97 4.11e- 1 4 Kruskal-Wallis rank~
## 7 Q4g Invent/design 1.30 8.61e- 1 4 Kruskal-Wallis rank~
## 8 Q4h Develop knowledge/~ 9.42 5.14e- 2 4 Kruskal-Wallis rank~
## 9 Q4i Personal/fam. time 2.69 6.11e- 1 4 Kruskal-Wallis rank~
## 10 Q4j Easy job 4.58 3.34e- 1 4 Kruskal-Wallis rank~
## 11 Q4k Exciting env. 14.6 5.63e- 3 4 Kruskal-Wallis rank~
## 12 Q4l Solve societal pro~ 72.0 8.48e-15 4 Kruskal-Wallis rank~
## 13 Q4m Use talent/abiliti~ 19.6 6.07e- 4 4 Kruskal-Wallis rank~
## 14 Q4n Do hands-on work 4.84 3.04e- 1 4 Kruskal-Wallis rank~
## 15 Q4o Apply math/sci. 0.555 9.68e- 1 4 Kruskal-Wallis rank~
## 16 Q4p Volunteer w/ chari~ 27.0 2.00e- 5 4 Kruskal-Wallis rank~
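For ordinal responses, a Kruskal-Wallis test per item is a rank-based alternative to the chi-squared test. The sketch below treats the clusters as the groups being compared, which is an assumption about how the original test was set up; the data are simulated.

```r
set.seed(11)
# Hypothetical Likert responses (0-4) for two items across seven clusters.
long <- data.frame(
  cluster_time_rank = factor(rep(sample(1:7, 350, replace = TRUE), times = 2)),
  Q4_item = rep(c("Q4a", "Q4b"), each = 350),
  Q4_resp = sample(0:4, 700, replace = TRUE)
)

# Kruskal-Wallis compares the response distributions across groups
# using ranks, so it respects the ordering of the Likert scale.
kw <- lapply(split(long, long$Q4_item), function(d) {
  kruskal.test(Q4_resp ~ cluster_time_rank, data = d)
})

# Degrees of freedom are the number of groups minus one.
sapply(kw, function(test) unname(test$parameter))
```

The `parameter` column reports the degrees of freedom, which is always one less than the number of groups in the grouping variable.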
## # A tibble: 13,762 x 5
## student_id major cluster_time_rank Q2_item Q2_resp
## <int> <chr> <int> <chr> <dbl>
## 1 1 Ele 5 Q2a 4
## 2 1 Ele 5 Q2b 0
## 3 1 Ele 5 Q2c 4
## 4 1 Ele 5 Q2d 2
## 5 1 Ele 5 Q2e 2
## 6 1 Ele 5 Q2f 2
## 7 1 Ele 5 Q2g 2
## 8 2 Ele 4 Q2a 4
## 9 2 Ele 4 Q2b 1
## 10 2 Ele 4 Q2c 2
## # ... with 13,752 more rows
Visualize with boxplots instead of mosaic plots.
## # A tibble: 7 x 6
## # Groups: Q2_item [7]
## Q2_item Q2_item_name statistic p.value parameter method
## <chr> <chr> <dbl> <dbl> <int> <chr>
## 1 Q2a Private/Corporate 40.9 0.0170 24 Pearson's Chi-squa~
## 2 Q2b Non-profit/NGO 58.1 0.000117 24 Pearson's Chi-squa~
## 3 Q2c Gov./Public Policy 20.9 0.647 24 Pearson's Chi-squa~
## 4 Q2d Education 39.3 0.0253 24 Pearson's Chi-squa~
## 5 Q2e Entrepreneurship/Sta~ 32.1 0.123 24 Pearson's Chi-squa~
## 6 Q2f Healthcare 20.4 0.673 24 Pearson's Chi-squa~
## 7 Q2g Other 24.1 0.457 24 Pearson's Chi-squa~
The following series of analyses looks at differences among the temporal discounting clusters in various beliefs about global warming and climate change.
##
## 0 1
## 1 144 861
## 2 25 76
## 3 27 119
## 4 80 263
## 5 41 95
## 6 231 306
## 7 134 36
##
## Pearson's Chi-squared test
##
## data: cont_table
## X-squared = 382.38, df = 6, p-value < 2.2e-16
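The test above can be reproduced directly from the printed counts. The `dimnames` labels in this sketch are assumptions, since the output does not name the variables behind `cont_table`.

```r
# Counts copied from the table above: rows are clusters 1-7, columns are 0/1.
cont_table <- matrix(
  c(144, 861,
     25,  76,
     27, 119,
     80, 263,
     41,  95,
    231, 306,
    134,  36),
  ncol = 2, byrow = TRUE,
  dimnames = list(cluster_time_rank = as.character(1:7),
                  response = c("0", "1"))
)

test <- chisq.test(cont_table)
test$statistic   # X-squared
test$parameter   # degrees of freedom: (7 - 1) * (2 - 1) = 6

# Share answering 1 within each cluster, to see where the difference comes from.
round(prop.table(cont_table, margin = 1)[, "1"], 2)
```

The row proportions are often more informative than the statistic itself: here the share answering 1 drops sharply in the highest-ranked clusters.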
##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 2466
##
##
## | climate_df$Q20a_bin
## climate_df$Q29 | 0 | 1 | Row Total |
## ---------------|-----------|-----------|-----------|
## 1 | 6 | 64 | 70 |
## | 4.245 | 1.023 | |
## | 0.086 | 0.914 | 0.028 |
## | 0.013 | 0.032 | |
## | 0.002 | 0.026 | |
## ---------------|-----------|-----------|-----------|
## 3 | 17 | 92 | 109 |
## | 0.822 | 0.198 | |
## | 0.156 | 0.844 | 0.044 |
## | 0.035 | 0.046 | |
## | 0.007 | 0.037 | |
## ---------------|-----------|-----------|-----------|
## 4 | 67 | 319 | 386 |
## | 0.849 | 0.205 | |
## | 0.174 | 0.826 | 0.157 |
## | 0.140 | 0.161 | |
## | 0.027 | 0.129 | |
## ---------------|-----------|-----------|-----------|
## 5 | 83 | 382 | 465 |
## | 0.594 | 0.143 | |
## | 0.178 | 0.822 | 0.189 |
## | 0.173 | 0.192 | |
## | 0.034 | 0.155 | |
## ---------------|-----------|-----------|-----------|
## 6 | 10 | 21 | 31 |
## | 2.629 | 0.634 | |
## | 0.323 | 0.677 | 0.013 |
## | 0.021 | 0.011 | |
## | 0.004 | 0.009 | |
## ---------------|-----------|-----------|-----------|
## 7 | 16 | 76 | 92 |
## | 0.196 | 0.047 | |
## | 0.174 | 0.826 | 0.037 |
## | 0.033 | 0.038 | |
## | 0.006 | 0.031 | |
## ---------------|-----------|-----------|-----------|
## 8 | 60 | 177 | 237 |
## | 4.236 | 1.021 | |
## | 0.253 | 0.747 | 0.096 |
## | 0.125 | 0.089 | |
## | 0.024 | 0.072 | |
## ---------------|-----------|-----------|-----------|
## 10 | 6 | 84 | 90 |
## | 7.541 | 1.818 | |
## | 0.067 | 0.933 | 0.036 |
## | 0.013 | 0.042 | |
## | 0.002 | 0.034 | |
## ---------------|-----------|-----------|-----------|
## 11 | 19 | 87 | 106 |
## | 0.123 | 0.030 | |
## | 0.179 | 0.821 | 0.043 |
## | 0.040 | 0.044 | |
## | 0.008 | 0.035 | |
## ---------------|-----------|-----------|-----------|
## 12 | 8 | 47 | 55 |
## | 0.674 | 0.162 | |
## | 0.145 | 0.855 | 0.022 |
## | 0.017 | 0.024 | |
## | 0.003 | 0.019 | |
## ---------------|-----------|-----------|-----------|
## 13 | 141 | 449 | 590 |
## | 6.080 | 1.466 | |
## | 0.239 | 0.761 | 0.239 |
## | 0.294 | 0.226 | |
## | 0.057 | 0.182 | |
## ---------------|-----------|-----------|-----------|
## 16 | 24 | 93 | 117 |
## | 0.071 | 0.017 | |
## | 0.205 | 0.795 | 0.047 |
## | 0.050 | 0.047 | |
## | 0.010 | 0.038 | |
## ---------------|-----------|-----------|-----------|
## 17 | 15 | 63 | 78 |
## | 0.002 | 0.000 | |
## | 0.192 | 0.808 | 0.032 |
## | 0.031 | 0.032 | |
## | 0.006 | 0.026 | |
## ---------------|-----------|-----------|-----------|
## 18 | 7 | 33 | 40 |
## | 0.076 | 0.018 | |
## | 0.175 | 0.825 | 0.016 |
## | 0.015 | 0.017 | |
## | 0.003 | 0.013 | |
## ---------------|-----------|-----------|-----------|
## Column Total | 479 | 1987 | 2466 |
## | 0.194 | 0.806 | |
## ---------------|-----------|-----------|-----------|
##
##
## Statistics for All Table Factors
##
##
## Pearson's Chi-squared test
## ------------------------------------------------------------
## Chi^2 = 34.91984 d.f. = 13 p = 0.0008710119
##
##
##
##
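The "Chi-square contribution" line in each cell of the cross-tabulation above is (observed - expected)^2 / expected. The sketch below recomputes it from the printed counts; the object names are illustrative.

```r
# Counts copied from the cross-tabulation above (rows: Q29 major codes).
counts <- matrix(
  c(  6,  64,   17,  92,   67, 319,   83, 382,   10,  21,
     16,  76,   60, 177,    6,  84,   19,  87,    8,  47,
    141, 449,   24,  93,   15,  63,    7,  33),
  ncol = 2, byrow = TRUE,
  dimnames = list(Q29 = c("1", "3", "4", "5", "6", "7", "8", "10",
                          "11", "12", "13", "16", "17", "18"),
                  Q20a_bin = c("0", "1"))
)

# Expected count under independence: row total * column total / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Each cell's contribution to the chi-squared statistic.
contribution <- (counts - expected)^2 / expected

round(contribution["1", ], 3)   # contributions for major code 1
sum(contribution)               # the overall chi-squared statistic
```

Cells with large contributions (for example major codes 10 and 13 in column 0) are the ones driving the overall result.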
## 0 1
## 1 77 941
## 2 15 87
## 3 17 132
## 4 42 304
## 5 21 114
## 6 173 373
## 7 134 36
##
## Pearson's Chi-squared test
##
## data: cont_table
## X-squared = 547.76, df = 6, p-value < 2.2e-16
## Q5d
## major 0 1
## Aer/Oce 49 21
## Bio 100 10
## Che 286 183
## Civ 249 141
## Comp 71 22
## Con 24 7
## Ele 176 62
## Env/Eco 22 68
## Gen 19 21
## Ind 83 25
## Mat 33 22
## Mec 377 219
## Softw 101 17
## Str/Arc 48 30
##
## 0 1
## 1 168 849
## 2 26 76
## 3 29 120
## 4 80 266
## 5 37 98
## 6 241 305
## 7 144 26
##
## Pearson's Chi-squared test
##
## data: cont_table
## X-squared = 403.53, df = 6, p-value < 2.2e-16
##
## 0 1
## 1 312 706
## 2 49 52
## 3 73 76
## 4 151 195
## 5 75 60
## 6 361 185
## 7 154 15
##
## Pearson's Chi-squared test
##
## data: cont_table
## X-squared = 326.39, df = 6, p-value < 2.2e-16
##
## 0 1
## 1 484 542
## 2 45 58
## 3 63 87
## 4 160 188
## 5 70 67
## 6 357 192
## 7 134 39
##
## Pearson's Chi-squared test
##
## data: cont_table
## X-squared = 105.29, df = 6, p-value < 2.2e-16
Plot for Q23
Alternative plot for Q23 scores using jitter
Differences by cluster in beliefs about how to slow down climate change
Initial plot for Q24
Alternative plot for Q24
##
## 0 1
## 1 867 112
## 2 87 13
## 3 121 21
## 4 282 45
## 5 115 12
## 6 459 63
## 7 137 24
##
## Pearson's Chi-squared test
##
## data: cont_table_Q25a
## X-squared = 4.1767, df = 6, p-value = 0.6528
##
## 0 1
## 1 582 397
## 2 52 48
## 3 75 67
## 4 193 134
## 5 76 51
## 6 379 143
## 7 121 40
##
## Pearson's Chi-squared test
##
## data: cont_table_Q25b
## X-squared = 50.155, df = 6, p-value = 4.376e-09
##
## 0 1
## 1 427 552
## 2 36 64
## 3 61 81
## 4 142 185
## 5 56 71
## 6 266 256
## 7 104 57
##
## Pearson's Chi-squared test
##
## data: cont_table_Q25c
## X-squared = 35.308, df = 6, p-value = 3.756e-06
Recode Q26 as true/false and then create a total score out of 8.
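A scoring step of this kind could be sketched as below. The item names (`Q26a` to `Q26h`), the raw "T"/"F" coding, and especially the answer key are hypothetical, since none of them are given in the report.

```r
# Hypothetical raw answers to eight true/false items (Q26a-Q26h).
answers <- data.frame(matrix(
  c("T", "F", "T", "T", "F", "F", "T", "F",
    "T", "T", "F", "T", "F", "T", "T", NA),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, paste0("Q26", letters[1:8]))
))

# Hypothetical answer key; the real key is not given in the text.
key <- c("T", "F", "T", "T", "F", "F", "T", "F")

# Score each item 1 if it matches the key, 0 otherwise (NA counts as 0).
correct <- sweep(as.matrix(answers), 2, key, FUN = "==")
correct[is.na(correct)] <- FALSE

answers$Q26_total <- rowSums(correct)   # total score out of 8
answers$Q26_total
```

Treating a skipped item as incorrect is a design choice; dropping respondents with missing items would be the alternative.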
Initial plot for Q26
Alternative plot for Q26 using geom_jitter